Triplet Attention Transformer for Spatiotemporal Predictive Learning
Spatiotemporal predictive learning offers a self-supervised learning paradigm
that enables models to learn both spatial and temporal patterns by predicting
future sequences based on historical sequences. Mainstream methods are
dominated by recurrent units, yet they are limited by their lack of
parallelization and often underperform in real-world scenarios. To improve
prediction quality while maintaining computational efficiency, we propose an
innovative triplet attention transformer designed to capture both inter-frame
dynamics and intra-frame static features. Specifically, the model incorporates
the Triplet Attention Module (TAM), which replaces traditional recurrent units
by exploring self-attention mechanisms in temporal, spatial, and channel
dimensions. In this configuration: (i) temporal tokens contain abstract
representations of inter-frame dynamics, facilitating the capture of inherent temporal
dependencies; (ii) spatial and channel attention combine to refine the
intra-frame representation by performing fine-grained interactions across
spatial and channel dimensions. Alternating temporal, spatial, and
channel-level attention allows our approach to learn more complex short- and
long-range spatiotemporal dependencies. Extensive experiments demonstrate
performance surpassing existing recurrent-based and recurrent-free methods,
achieving state-of-the-art under multi-scenario examination including moving
object trajectory prediction, traffic flow prediction, driving scene
prediction, and human motion capture.
Comment: Accepted to WACV 202
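The alternating temporal, spatial, and channel attention described above can be illustrated with a minimal, dependency-free sketch. This is a toy version (identity projections, single head); the actual TAM uses learned projections, multi-head attention, and residual connections, all omitted here:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(tokens):
    # Plain scaled dot-product self-attention over a list of equal-length
    # vectors, with identity Q/K/V projections for illustration only.
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, tokens))
                    for j in range(d)])
    return out

def triplet_attention_step(x):
    # x[t][n][c]: T frames, N spatial positions, C channels.
    # Attend along the temporal, then spatial, then channel axis.
    T, N, C = len(x), len(x[0]), len(x[0][0])
    # Temporal: for each spatial position, the T per-frame vectors are tokens
    for n in range(N):
        col = self_attention([x[t][n] for t in range(T)])
        for t in range(T):
            x[t][n] = col[t]
    # Spatial: within each frame, the N position vectors are tokens
    for t in range(T):
        x[t] = self_attention(x[t])
    # Channel: within each frame, transpose so the C channel maps are tokens
    for t in range(T):
        ch = self_attention([[x[t][n][c] for n in range(N)] for c in range(C)])
        x[t] = [[ch[c][n] for c in range(C)] for n in range(N)]
    return x
```

Each pass mixes information along exactly one axis, which is what lets the alternation capture both inter-frame dynamics and intra-frame structure without recurrence.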
UniHCP: A Unified Model for Human-Centric Perceptions
Human-centric perceptions (e.g., pose estimation, human parsing, pedestrian
detection, person re-identification, etc.) play a key role in industrial
applications of visual models. While specific human-centric tasks have their
own relevant semantic aspect to focus on, they also share the same underlying
semantic structure of the human body. However, few works have attempted to
exploit such homogeneity and design a general-purpose model for human-centric
tasks. In this work, we revisit a broad range of human-centric tasks and unify
them in a minimalist manner. We propose UniHCP, a Unified Model for
Human-Centric Perceptions, which unifies a wide range of human-centric tasks in
a simplified end-to-end manner with the plain vision transformer architecture.
With large-scale joint training on 33 human-centric datasets, UniHCP can
outperform strong baselines on several in-domain and downstream tasks by direct
evaluation. When adapted to a specific task, UniHCP achieves new SOTAs on a
wide range of human-centric tasks, e.g., 69.8 mIoU on CIHP for human parsing,
86.18 mA on PA-100K for attribute prediction, 90.3 mAP on Market1501 for ReID,
and 85.8 JI on CrowdHuman for pedestrian detection, performing better than
specialized models tailored for each task.Comment: Accepted for publication at the IEEE/CVF Conference on Computer
Vision and Pattern Recognition 2023 (CVPR 2023
Retrieve Anyone: A General-purpose Person Re-identification Task with Instructions
Human intelligence can retrieve any person according to both visual and
language descriptions. However, the current computer vision community studies
specific person re-identification (ReID) tasks in different scenarios
separately, which limits the applications in the real world. This paper strives
to resolve this problem by proposing a new instruct-ReID task that requires the
model to retrieve images according to the given image or language
instructions. Our instruct-ReID is a more general ReID setting, where existing
ReID tasks can be viewed as special cases by designing different instructions.
We propose a large-scale OmniReID benchmark and an adaptive triplet loss as a
baseline method to facilitate research in this new setting. Experimental
results show that the baseline model trained on our OmniReID benchmark improves
mAP by +0.5% and +3.3% on Market1501 and CUHK03 for traditional ReID; by
+2.1%, +0.2%, and +15.3% on PRCC, VC-Clothes, and LTCC for clothes-changing
ReID; by +12.5% on COCAS+ real2 for clothes-template based clothes-changing
ReID when using only RGB images; and by +25.5% on COCAS+ real2 for our newly
defined language-instructed ReID. The dataset, model, and code will be
available at
https://github.com/hwz-zju/Instruct-ReID
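The fixed-margin triplet loss, the standard starting point for a baseline like the paper's adaptive triplet loss, can be sketched as follows. The adaptive variant modulates this objective per instruction; that modification is not reproduced here:

```python
def euclidean_dist(a, b):
    # L2 distance between two embedding vectors
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def triplet_loss(anchor, positive, negative, margin=0.3):
    # Pull the positive closer to the anchor than the negative by at
    # least `margin`; zero loss once the constraint is satisfied.
    return max(0.0, euclidean_dist(anchor, positive)
                    - euclidean_dist(anchor, negative) + margin)
```

For example, a triplet whose negative is already far away incurs zero loss, while one whose negative is closer than the positive is penalized by the full violation plus the margin.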
Implementation and performances of the IPbus protocol for the JUNO Large-PMT readout electronics
The Jiangmen Underground Neutrino Observatory (JUNO) is a large neutrino
detector currently under construction in China. Thanks to the tight
requirements on its optical and radio-purity properties, it will be able to
perform leading measurements detecting terrestrial and astrophysical neutrinos
in a wide energy range from tens of keV to hundreds of MeV. A key requirement
for the success of the experiment is an unprecedented 3% energy resolution,
guaranteed by its large active mass (20 kton) and the use of more than 20,000
20-inch photo-multiplier tubes (PMTs) acquired by high-speed, high-resolution
sampling electronics located very close to the PMTs. As the Front-End and
Read-Out electronics are expected to run continuously underwater for 30 years,
a reliable readout acquisition system had to be developed, capable of handling
the timestamped data stream coming from the Large-PMTs while allowing the
inaccessible electronics to be monitored and operated remotely. In this
contribution, the firmware and hardware implementation of the IPbus based
readout protocol will be presented, together with the performance measured on
final modules during the mass production of the electronics.
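The flavor of such a register-level readout protocol can be sketched as a single read transaction packed into big-endian 32-bit words. The field layout below is illustrative and deliberately simplified; the authoritative wire format is defined by the IPbus 2.0 specification:

```python
import struct

# Illustrative 32-bit-word framing in the spirit of IPbus: a control
# packet carrying one read transaction. Field positions are a sketch,
# not the normative IPbus 2.0 layout.
READ = 0x0  # transaction type code for "read"

def encode_read(packet_id, txn_id, addr, nwords=1):
    # Packet header: protocol version in the top nibble, packet id,
    # byte-order qualifier nibble (all illustrative).
    pkt_hdr = (2 << 28) | ((packet_id & 0xFFFF) << 8) | 0xF0
    # Transaction header: version, transaction id, word count, type code.
    txn_hdr = ((2 << 28) | ((txn_id & 0xFFF) << 16)
               | ((nwords & 0xFF) << 8) | (READ << 4) | 0xF)
    return struct.pack(">III", pkt_hdr, txn_hdr, addr)

def decode_read(buf):
    pkt_hdr, txn_hdr, addr = struct.unpack(">III", buf)
    return {"packet_id": (pkt_hdr >> 8) & 0xFFFF,
            "txn_id": (txn_hdr >> 16) & 0xFFF,
            "nwords": (txn_hdr >> 8) & 0xFF,
            "type": (txn_hdr >> 4) & 0xF,
            "addr": addr}
```

In a real deployment such requests travel over UDP to the front-end FPGA, and the transaction id lets the client match replies to outstanding requests on an unreliable transport.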
Mass testing of the JUNO experiment 20-inch PMTs readout electronics
The Jiangmen Underground Neutrino Observatory (JUNO) is a multi-purpose,
large size, liquid scintillator experiment under construction in China. JUNO
will perform leading measurements detecting neutrinos from different sources
(reactor, terrestrial and astrophysical neutrinos) covering a wide energy range
(from 200 keV to several GeV). This paper focuses on the design and development
of a test protocol for the 20-inch PMT underwater readout electronics,
performed in parallel to the mass production line. Over a period of about
ten months, a total of 6950 electronic boards were tested, with an
acceptance yield of 99.1%.
Validation and integration tests of the JUNO 20-inch PMTs readout electronics
The Jiangmen Underground Neutrino Observatory (JUNO) is a large neutrino
detector currently under construction in China. JUNO will be able to study the
neutrino mass ordering and to perform leading measurements detecting
terrestrial and astrophysical neutrinos in a wide energy range, spanning from
200 keV to several GeV. Given the ambitious physics goals of JUNO, the
electronic system has to meet specific tight requirements, and a thorough
characterization is required. The present paper describes the tests performed
on the readout modules to measure their performance.
Comment: 20 pages, 13 figures
Real-time Monitoring for the Next Core-Collapse Supernova in JUNO
A core-collapse supernova (CCSN) is one of the most energetic astrophysical
events in the Universe. The early and prompt detection of neutrinos before
(pre-SN) and during the SN burst is a unique opportunity to realize the
multi-messenger observation of the CCSN events. In this work, we describe the
monitoring concept and present the sensitivity of the system to the pre-SN and
SN neutrinos at the Jiangmen Underground Neutrino Observatory (JUNO), which is
a 20 kton liquid scintillator detector under construction in South China. The
real-time monitoring system is designed with both the prompt monitors on the
electronic board and online monitors at the data acquisition stage, in order to
ensure both the alert speed and alert coverage of progenitor stars. By assuming
a false alert rate of 1 per year, this monitoring system can be sensitive to
the pre-SN neutrinos up to the distance of about 1.6 (0.9) kpc and SN neutrinos
up to about 370 (360) kpc for a progenitor mass of 30 solar masses for the case
of normal (inverted) mass ordering. The pointing ability of the CCSN is
evaluated by using the accumulated event anisotropy of the inverse beta decay
interactions from pre-SN or SN neutrinos, which, along with the early alert,
can play important roles in the follow-up multi-messenger observations of the
next Galactic or nearby extragalactic CCSN.
Comment: 24 pages, 9 figures
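One standard way to turn a false-alert budget like 1 per year into an online trigger threshold is to require the smallest event count in a sliding window for which the background-only Poisson tail probability, multiplied by the number of windows per year, stays below the budget. The sketch below illustrates that reasoning; it is not JUNO's actual monitoring algorithm:

```python
import math

def poisson_tail(mu, n):
    # P(X >= n) for X ~ Poisson(mu)
    return 1.0 - sum(math.exp(-mu) * mu**k / math.factorial(k)
                     for k in range(n))

def alert_threshold(bkg_rate_hz, window_s, max_false_per_year=1.0):
    # Smallest event count N in one window such that the expected number
    # of background-only windows reaching N per year is below the budget.
    mu = bkg_rate_hz * window_s
    windows_per_year = 365.25 * 86400 / window_s
    n = 1
    while poisson_tail(mu, n) * windows_per_year > max_false_per_year:
        n += 1
    return n
```

A lower background rate yields a lower threshold, which in turn pushes the alert-sensitive distance outward; this is the trade-off between alert speed and false-alert rate mentioned above.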